Solving Continuous Control via Q-learning
While there has been substantial success for solving continuous control with
actor-critic methods, simpler critic-only methods such as Q-learning find
limited application in the associated high-dimensional action spaces. However,
most actor-critic methods come at the cost of added complexity: heuristics for
stabilisation, compute requirements and wider hyperparameter search spaces. We
show that a simple modification of deep Q-learning largely alleviates these
issues. By combining bang-bang action discretization with value decomposition,
framing single-agent control as cooperative multi-agent reinforcement learning
(MARL), this simple critic-only approach matches the performance of
state-of-the-art continuous actor-critic methods when learning from features or
pixels. We extend classical bandit examples from cooperative MARL to provide
intuition for how decoupled critics leverage state information to coordinate
joint optimization, and demonstrate surprisingly strong performance across a
variety of continuous control tasks.
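
To make the combination of bang-bang discretization and value decomposition concrete, here is a minimal sketch, not the authors' implementation. It assumes the joint value is the mean of per-dimension utilities, and it uses a random array `per_dim_q` as a stand-in for the output of a learned critic at a single state; the function names are hypothetical. It illustrates why greedy action selection stays linear in the number of action dimensions: each dimension can maximize its own utility independently, and the result agrees with a brute-force search over all 2^M joint bang-bang actions.

```python
import itertools
import numpy as np

# The two extreme (bang-bang) actions available in every action dimension.
BANG_BANG = np.array([-1.0, 1.0])

def joint_q(per_dim_q, choice):
    """Joint value as the mean of per-dimension utilities (assumed aggregation)."""
    m = per_dim_q.shape[0]
    return per_dim_q[np.arange(m), choice].mean()

def greedy_action(per_dim_q):
    """Each dimension picks its own best bin: M argmax operations, not 2**M."""
    choice = per_dim_q.argmax(axis=1)
    return BANG_BANG[choice], choice

def brute_force_action(per_dim_q):
    """Exhaustive search over all joint bang-bang actions, for comparison only."""
    m = per_dim_q.shape[0]
    best = max(
        itertools.product(range(2), repeat=m),
        key=lambda c: joint_q(per_dim_q, np.array(c)),
    )
    best = np.array(best)
    return BANG_BANG[best], best

# Hypothetical per-dimension utilities for M = 4 action dimensions at one state.
rng = np.random.default_rng(0)
per_dim_q = rng.normal(size=(4, 2))

action, choice = greedy_action(per_dim_q)
bf_action, bf_choice = brute_force_action(per_dim_q)
```

Because the mean is monotone in each per-dimension term, the decoupled greedy choice and the exhaustive search select the same joint action whenever there are no ties.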
A Parallel Autonomy Research Platform
We present the development of a full-scale “parallel autonomy” research platform including software and hardware. In the parallel autonomy paradigm, the control of the vehicle is shared; the human is still in control of the vehicle, but the autonomy system is always running in the background to prevent accidents. Our holistic approach includes: (1) a drive-by-wire conversion method based only on reverse engineering, (2) mounting of relatively inexpensive sensors onto the vehicle, (3) implementation of a localization and mapping system, (4) obstacle detection, (5) a shared controller and (6) integration with an advanced autonomy simulation system (Drake) for rapid development and testing. The system can operate in three modes: (a) manual driving, (b) full autonomy, where the system is in complete control of the vehicle, and (c) parallel autonomy, where the shared controller is implemented. We present results from extensive testing of a full-scale vehicle on closed tracks that demonstrate these capabilities.